Abstract

We say that two finite words u and v are abelian equivalent if and only
if they have the same number of occurrences of each letter, or equivalently
if they define the same Parikh vector. In this paper we investigate various
abelian properties of words, including abelian complexity and abelian powers.
We study the abelian complexity of the Thue-Morse word and the Tribonacci
word, and answer an old question of G. Rauzy by exhibiting a class of words
whose abelian complexity is everywhere equal to 3. We also investigate
abelian repetitions in words and show that any infinite word with bounded
abelian complexity contains abelian k-powers for every positive integer k.



1 Introduction 

It appears that very little is known on the abelian complexity of an infinite
word [12, 15, 22]. In fact, to the best of our knowledge, this paper may be
the first time that the very notion of abelian complexity is formally defined.
This abstract provides a comprehensive study of the abelian complexity of an
infinite word and its connection with other well-known word combinatorial
notions. As it is intended as an extended abstract, most of the proofs of the
results are omitted.

We begin with a brief introduction outlining the key definitions relevant 
to the paper. We assume a certain familiarity with the basic notions in Com- 
binatorics on Words. In Section 3 we provide extremal values for the abelian 
complexity. In Section 4 we discuss a fundamental link between abelian 
complexity and balance of an infinite word, recalling in particular results con- 
cerning Sturmian words. Then in Section 5 we provide two answers to an old 
question of Rauzy by exhibiting two different classes of words whose abelian 
complexity is everywhere equal to 3. Sections 6 and 7 are devoted to the study 
of the abelian complexity of the Thue-Morse word and the Tribonacci word. 
In Section 8 we investigate a connection between abelian complexity and the 
presence of abelian k-powers. In particular, as a consequence of the
well-known van der Waerden's theorem, we deduce that an infinite word having
bounded abelian complexity contains an abelian k-power for every positive
integer k. Section 9 contains a detailed study of abelian powers in Sturmian 
words, the Thue-Morse word and the Tribonacci word. 

2 Definition of the abelian complexity 

We assume the reader is familiar with basic results and notions of combi- 
natorics on words (for further information see, e.g., ifTTlfTSllTl I). Given an 
alphabet $A$, that is a finite non-empty set, we denote by $A^*$,
$A^{\mathbb{N}}$ and $A^{\mathbb{Z}}$ respectively the set of finite words,
the set of (right) infinite words and the set of biinfinite words over $A$.
For a finite word $u = a_1 a_2 \cdots a_n$ with $n \geq 0$ (when $n = 0$,
$u$ is the empty word $\varepsilon$) and $a_i \in A$, $n$ is called the length
of the word $u$ and denoted $|u|$. For each $a \in A$, let $|u|_a$ denote the
number of occurrences of the letter $a$ in $u$. Two words $u$ and $v$ in $A^*$
are said to be abelian equivalent, denoted $u \sim_{ab} v$, if and only if
$|u|_a = |v|_a$ for all $a \in A$. It is readily verified that $\sim_{ab}$
defines an equivalence relation on $A^*$.

Let $\omega$ be an infinite word on the alphabet $A$, that is
$\omega = \omega_0 \omega_1 \omega_2 \cdots$ with each $\omega_i$ in $A$. Any
finite word of the form $\omega_i \omega_{i+1} \cdots \omega_{i+n-1}$ (with
$i \geq 0$) is called a factor of $\omega$. Let $\mathcal{F}_\omega(n)$ denote
the set of all factors of $\omega$ of length $n$, and set
$\rho_\omega(n) = \mathrm{Card}(\mathcal{F}_\omega(n))$. The function
$\rho_\omega : \mathbb{N} \to \mathbb{N}$ is called the subword complexity
function of $\omega$. Analogously we define
$\mathcal{F}^{ab}_\omega(n) = \mathcal{F}_\omega(n)/\sim_{ab}$ and set
$\rho^{ab}_\omega(n) = \mathrm{Card}(\mathcal{F}^{ab}_\omega(n))$.

Definition 2.1. The function $\rho^{ab} = \rho^{ab}_\omega : \mathbb{N} \to
\mathbb{N}$ which counts the number of pairwise non abelian equivalent factors
of $\omega$ of length $n$ is called the abelian complexity or ab-complexity
for short.

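In most instances, the alphabet $A$ will consist of the numbers
$\{0, 1, 2, \ldots, k-1\}$. In this case, for each $u \in A^*$, we denote by
$\Psi(u)$ the Parikh vector associated to $u$, that is

$$\Psi(u) = (|u|_0, |u|_1, \ldots, |u|_{k-1}).$$

Given an infinite word $\omega \in A^{\mathbb{N}}$, we set

$$\Psi_\omega(n) = \{\Psi(u) \mid u \in \mathcal{F}_\omega(n)\}$$

so that

$$\rho^{ab}_\omega(n) = \mathrm{Card}(\Psi_\omega(n)).$$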
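
The definitions of this section can be made concrete with a minimal Python
sketch. It is an illustration only, not part of the original paper: the
function names are assumptions, and an infinite word is approximated by a
finite prefix, so the counts returned are in general only lower bounds for the
true complexities.

```python
from collections import Counter


def parikh_vector(u, alphabet):
    """Return the Parikh vector of the finite word u over the ordered alphabet."""
    counts = Counter(u)
    return tuple(counts[a] for a in alphabet)


def abelian_equivalent(u, v):
    """Two finite words are abelian equivalent iff all their letter counts agree."""
    return Counter(u) == Counter(v)


def factor_counts(prefix, n, alphabet):
    """Return (subword count, abelian count) over the length-n factors of prefix."""
    factors = {prefix[i:i + n] for i in range(len(prefix) - n + 1)}
    vectors = {parikh_vector(f, alphabet) for f in factors}
    return len(factors), len(vectors)
```

For instance, on the prefix 0110100110010110 of the Thue-Morse word, the four
factors 00, 01, 10, 11 of length 2 give only three Parikh vectors, since 01
and 10 are abelian equivalent.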

3 Extremal values 

A natural question concerns the possible extremal values of the abelian
complexity. The following result due to Coven and Hedlund is a
characterization of periodic words in terms of abelian complexity:

Lemma 3.1 (E.M. Coven and G.A. Hedlund, [12, Remark 4.07]). Let
$\omega \in A^{\mathbb{N}} \cup A^{\mathbb{Z}}$ be a right infinite or a
biinfinite word. Then $\omega$ is periodic of period $p$ if and only if
$\rho^{ab}_\omega(p) = 1$.

The "only if" part is immediate. The converse follows from the observation
that a non-periodic word $\omega$ must contain arbitrarily long right special
factors, implying $\rho^{ab}_\omega(n) \geq 2$ for all $n \geq 1$. (Let us
recall that a word $u$ is a right special factor of an infinite word $\omega$
if for two different letters $\alpha$ and $\beta$, the words $u\alpha$ and
$u\beta$ are both factors of $\omega$.)

Lemma 3.1 may be regarded as the abelian analogue of the celebrated result
of M. Morse and G.A. Hedlund [20] stating that a biinfinite word is periodic
if and only if its subword complexity is bounded. Hence both the subword 
complexity and the ab-complexity may be used to characterize non-periodic 
biinfinite words. The situation for right infinite words is slightly different 
since in this case infinite words with bounded complexity correspond to
ultimately periodic words (that is, words of the form $uv^\infty$ where
$v^\infty$ denotes the periodic word with period $|v|$ obtained by
concatenating $v$ infinitely often). In
the rest of the paper, we will state results only concerning right infinite words 
although many of these results remain true in the context of biinfinite words. 

Concerning the maximal abelian complexity, it is clear that it is reached
by any infinite word containing all finite words as factors, as for instance
the Champernowne word (which is obtained by concatenating all finite words
enumerated with respect to the radix order). Let us denote by
$\rho^{ab}_{\max}$ the abelian complexity of such a word. Since, for any word
$u$ of length $n$ over a $k$-letter alphabet, $\Psi(u)$ is a $k$-tuple
$(i_1, \ldots, i_k)$ with $n = i_1 + i_2 + \cdots + i_k$, the value
$\rho^{ab}_{\max}(n)$ is the number of ways to write $n$ as the sum of $k$
nonnegative integers. This well-known number (see, e.g., [30]) is called the
number of compositions of $n$ into $k$ parts and is given by the binomial
coefficient $\binom{n+k-1}{k-1}$. This can be summarized as follows:

Theorem 3.2. For all infinite words $\omega$ over a $k$-letter alphabet, and
for all $n \geq 0$,

$$1 \leq \rho^{ab}_\omega(n) \leq \binom{n+k-1}{k-1}.$$

In particular, the ab-complexity is bounded by $O(n^{k-1})$.
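
The upper bound can be checked on small cases with the following Python
sketch. It is illustrative only and its function names are assumptions: one
function evaluates the binomial coefficient, the other counts Parikh vectors
by enumerating every word of length n over k letters, which is feasible only
for small n and k.

```python
from itertools import product
from math import comb


def max_abelian_complexity(n, k):
    """Closed form: number of Parikh vectors of length-n words over k letters."""
    return comb(n + k - 1, k - 1)


def count_parikh_vectors_brute_force(n, k):
    """Count distinct Parikh vectors by enumerating all k**n words of length n."""
    letters = range(k)
    vectors = set()
    for word in product(letters, repeat=n):
        vectors.add(tuple(word.count(a) for a in letters))
    return len(vectors)
```

Over a binary alphabet the bound is n + 1, and over three letters it is
(n + 1)(n + 2)/2, which matches the growth rate stated above.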

We end this section with two examples illustrating some key differences
between the behavior of the subword complexity and the abelian complexity.
The first one was first pointed out to us by P. Arnoux [5]: Let $\omega$
denote the morphic image of the Champernowne word

$$C = 01101110010111011110001001\ldots$$

under the Thue-Morse morphism $\mu$ defined by $0 \mapsto 01$ and
$1 \mapsto 10$. Then while $\rho_\omega(n)$ has exponential growth, we will
see in Section 6 that $\rho^{ab}_\omega(n) \leq 3$ for all $n$.

The second example is to be contrasted with the first one: There exist
binary infinite words having maximal abelian complexity but linear subword
complexity. Indeed let $f$ and $g$ be the morphisms defined by $f(a) = abc$,
$f(b) = bbb$, $f(c) = ccc$, $g(a) = 0 = g(c)$ and $g(b) = 1$. Let $\omega'$
denote the fixed point of $f$ beginning in $a$. Then the image of $\omega'$
under $g$ is the word

$$\omega = 0 \prod_{i \geq 0} 1^{3^i} 0^{3^i}.$$

It is readily verified that $\rho^{ab}_\omega = \rho^{ab}_{\max}$. Since
$\omega$ is an automatic sequence, it has linear complexity (see Theorems
6.3.2 and 10.3.1 in [4]).

4 Links with balance properties

In this section we investigate a connection between abelian complexity and
the notion of balance: Following [10] we say that an infinite word
$\omega \in A^{\mathbb{N}}$ is $C$-balanced ($C$ a positive integer) if
$\big| |U|_a - |V|_a \big| \leq C$ for all $a \in A$ and all factors $U$ and
$V$ of $\omega$ of equal length. A word $\omega$ is said to be balanced if it
is 1-balanced. It is easy to see that

Lemma 4.1. For a word $\omega \in A^{\mathbb{N}} \cup A^{\mathbb{Z}}$, the
function $\rho^{ab}_\omega$ is bounded if and only if $\omega$ is $C$-balanced
for some positive integer $C$.
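
The balance condition can be explored numerically with the Python sketch
below. It is an illustration under stated assumptions: the function names are
hypothetical, the infinite word is replaced by a finite prefix, and the
Thue-Morse word (studied in Section 6) is used as test input, generated from
the parity of binary digit sums. For each length n the code slides a window
over the prefix and returns the largest spread of any letter count, that is,
the smallest C that works for that length on the prefix.

```python
def balance_constant(prefix, n, alphabet):
    """Smallest C with | |U|_a - |V|_a | <= C over all length-n factors of prefix."""
    worst = 0
    for a in alphabet:
        # Sliding-window count of the letter a in each length-n factor.
        count = prefix[:n].count(a)
        lo = hi = count
        for i in range(n, len(prefix)):
            count += (prefix[i] == a) - (prefix[i - n] == a)
            lo, hi = min(lo, count), max(hi, count)
        worst = max(worst, hi - lo)
    return worst


def thue_morse_prefix(length):
    """Prefix of the Thue-Morse word: letter i is the parity of the bit sum of i."""
    return "".join(str(bin(i).count("1") % 2) for i in range(length))
```

On a binary alphabet the number of 1s in a sliding window changes by at most
one per step, so the number of Parikh vectors of length n is exactly this
spread plus one, which makes the equivalence of the lemma transparent in that
case.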

Let us recall that Sturmian words are precisely the binary aperiodic balanced
words, where aperiodic means non ultimately periodic (see [8]). As noticed by
G. Rauzy [22], it is a consequence of the works by E.M. Coven and
G.A. Hedlund that a word $\omega$ is Sturmian if and only if for all
$n \geq 1$ the cardinality of the set
$\{|u|_1 \mid u \in \mathcal{F}_\omega(n)\}$ is 2. In other words, we have the
following characterization which is the earliest result we know involving the
notion of abelian complexity.

Theorem 4.2 (E.M. Coven, G.A. Hedlund, [12]). Let $\omega$ be an aperiodic
binary right infinite word. Then $\omega$ is balanced (i.e., $\omega$ is a
Sturmian word) if and only if $\rho^{ab}_\omega(n) = 2$ for all $n \geq 1$.
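
As a sanity check of this characterization, the following Python sketch
computes both complexities on a prefix of the Fibonacci word, a standard
example of a Sturmian word that is not discussed in the text and is chosen
here only for illustration. The function names and the prefix length are
assumptions; the prefix is taken long enough that all short factors occur.

```python
def fibonacci_word(length):
    """Prefix of the Fibonacci word, the fixed point of 0 -> 01, 1 -> 0."""
    word = "0"
    while len(word) < length:
        word = "".join("01" if c == "0" else "0" for c in word)
    return word[:length]


def complexities(prefix, n):
    """Return (subword complexity, abelian complexity) of length-n factors."""
    factors = {prefix[i:i + n] for i in range(len(prefix) - n + 1)}
    # Over {0, 1} the Parikh vector of a factor is determined by its number of 1s.
    ones = {f.count("1") for f in factors}
    return len(factors), len(ones)
```

On this prefix the subword complexity is n + 1 and the abelian complexity is
2 for every tested length, as expected for a Sturmian word.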

Let us note that I. Kaboré and T. Tapsoba also characterized, using abelian
complexity, the family of so-called quasi-Sturmian words by insertion (a
subclass of the class of infinite words over a three-letter alphabet having
subword complexity $n + 2$) [15]. These words, defined over a ternary
alphabet, verify $\rho^{ab}(n) = 2$ for $n \neq 0$ even and
$\rho^{ab}(n) = 4$ for $n \neq 1$ odd.



5 Two Answers to a Question of G. Rauzy 

Inspired by the characterization of Sturmian words of Theorem 4.2, G. Rauzy
asked whether there exist aperiodic words on a 3-letter alphabet such that
$\rho^{ab}(n) = 3$ for all $n \geq 1$. Let $p \geq 3$ be any integer, let
$\omega'$ be any Sturmian word over $\{0, 1\}$ and let
$\omega = (p-1)(p-2) \cdots 2\,\omega'$ ($\omega$ is written over the
alphabet $\{0, 1, \ldots, p-1\}$). As a consequence of Theorem 4.2 we can see
that $\rho^{ab}_\omega(n) = p$ for all $n \geq 1$ (in particular when
$p = 3$). This provides a first answer to G. Rauzy's question. Nevertheless
it is not completely satisfactory since $\omega$ is not recurrent (an infinite
word is recurrent if each of its factors occurs infinitely often in it). We
end this section by exhibiting two families of uniformly recurrent words whose
ab-complexity is everywhere equal to 3. The next results provide answers that
include uniformly recurrent words (let us recall that an infinite word is
uniformly recurrent if each of its factors occurs infinitely often with
bounded gaps). The first one partially generalizes Theorem 4.2.



Theorem 5.1. Let $\omega$ be an aperiodic balanced word on a 3-letter
alphabet. Then the ab-complexity $\rho^{ab}_\omega(n) = 3$ for all $n \geq 1$.

Theorem 5.1 is a consequence of a characterization of aperiodic balanced
words due to P. Hubert in [14].

The next theorem illustrates that the converse of Theorem 5.1 does not
hold:

Theorem 5.2. Let $\omega' \in \{0, 1\}^{\mathbb{N}}$ be any aperiodic infinite
word, and let $\omega$ be the image of $\omega'$ under the morphism $f$
defined by $0 \mapsto 012$ and $1 \mapsto 021$. Then $\rho^{ab}_\omega(n) = 3$
for all $n \geq 1$.
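
Theorem 5.2 can be illustrated on one concrete input with the Python sketch
below. The choice of the Fibonacci word as the aperiodic binary word, the
function names, and the prefix length are all assumptions made for the sake of
the example; the check is empirical on a finite prefix and proves nothing.

```python
def fibonacci_word(length):
    """Prefix of the Fibonacci word, the fixed point of 0 -> 01, 1 -> 0."""
    word = "0"
    while len(word) < length:
        word = "".join("01" if c == "0" else "0" for c in word)
    return word[:length]


def apply_morphism(word, images):
    """Apply a morphism given as a dict letter -> image, letter by letter."""
    return "".join(images[c] for c in word)


def abelian_complexity(prefix, n, alphabet):
    """Number of distinct Parikh vectors among the length-n factors of prefix."""
    vectors = {
        tuple(prefix[i:i + n].count(a) for a in alphabet)
        for i in range(len(prefix) - n + 1)
    }
    return len(vectors)
```

Applying the morphism to a Fibonacci prefix of length 2000 and calling
`abelian_complexity` for lengths 1 to 30 returns 3 each time.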

It would be interesting to find a characterization of all recurrent infinite 
words with constant ab-complexity equal to 3. A related question is to find a 
recurrent word with constant ab-complexity equal to 4. We suspect no such 
word exists. Using [14], it can be shown that there does not exist a recurrent 
balanced word with constant ab-complexity equal to 4. 



6 The ab-complexity of the Thue-Morse word 

In Section 3 we announced that the image of the Champernowne word under
the Thue-Morse morphism $\mu$ has an ab-complexity bounded by 3. More
generally:

Theorem 6.1. The abelian complexity of an aperiodic word
$\omega \in \{0, 1\}^{\mathbb{N}}$ is

$$\rho^{ab}_\omega(n) = \begin{cases} 2 & \text{for } n \text{ odd}, \\ 3 & \text{for } n \neq 0 \text{ even}, \end{cases}$$

if and only if there exists a word $\omega'$ such that
$\omega = \mu(\omega')$, $\omega = 0\mu(\omega')$ or $\omega = 1\mu(\omega')$.

As a direct consequence we get the ab-complexity of the Thue-Morse
word $TM_0$, the fixed point of $\mu$ beginning in 0.

Theorem 6.2. $\rho^{ab}_{TM_0}(n) = 2$ for $n$ odd and
$\rho^{ab}_{TM_0}(n) = 3$ for $n \neq 0$ even.
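
The values in Theorem 6.2 are easy to reproduce experimentally. The Python
sketch below is illustrative (function names and prefix length are
assumptions): it builds a prefix of the Thue-Morse word by iterating the
morphism and counts, with a sliding window, how many different numbers of 1s
occur among factors of a given length.

```python
def thue_morse_prefix(length):
    """Prefix of the fixed point of 0 -> 01, 1 -> 10 that begins with 0."""
    word = "0"
    while len(word) < length:
        word = "".join("01" if c == "0" else "10" for c in word)
    return word[:length]


def abelian_complexity(prefix, n):
    """Abelian complexity of a binary prefix at length n, via a sliding window."""
    count = prefix[:n].count("1")
    seen = {count}
    for i in range(n, len(prefix)):
        # Move the window one step right: add the new letter, drop the old one.
        count += (prefix[i] == "1") - (prefix[i - n] == "1")
        seen.add(count)
    return len(seen)
```

With a prefix of length 4096 the function returns 2 for each odd length and 3
for each even length between 1 and 64.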

It is quite remarkable that the previous result follows only from the action
of the Thue-Morse morphism. Let us note that a similar situation holds when
considering the proof of Theorem 5.2.

Note also that Theorem 6.1 characterizes the class of all words having the
same abelian complexity as the Thue-Morse word. It is known ([1]) that every
recurrent infinite word $\omega \in \{0, 1\}^{\mathbb{N}}$ whose subword
complexity is equal to that of $TM_0$ is either in the shift orbit closure of
$TM_0$ or is in the shift orbit closure of $\Delta(TM_0)$ where $\Delta$ is
the letter doubling morphism defined by $0 \mapsto 00$ and $1 \mapsto 11$. As
a consequence we deduce that:

Corollary 6.3. A binary infinite word has the same subword complexity and
ab-complexity as the Thue-Morse word if and only if it is in the shift orbit
closure of $TM_0$.

To end this section, let us observe that Theorem 6.1 is false when the
aperiodicity hypothesis is removed. Indeed observe first that the word
$(01)^\infty = \mu(0^\infty)$ does not have the same abelian complexity as the
Thue-Morse word. Secondly, we let the reader verify that the ultimately
periodic word $0110(1001)^\infty$ has the same abelian complexity as $TM_0$.
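
Both observations can be checked on finite prefixes with the short Python
sketch below (an illustration with an assumed function name, not a proof).
The word $(01)^\infty$ has a single Parikh vector at every even length, whereas
the prefix of $0110(1001)^\infty$ shows the values 2 and 3 in the same
alternation as the Thue-Morse word.

```python
def abelian_complexity(prefix, n):
    """Abelian complexity of a binary prefix at length n, via a sliding window."""
    count = prefix[:n].count("1")
    seen = {count}
    for i in range(n, len(prefix)):
        count += (prefix[i] == "1") - (prefix[i - n] == "1")
        seen.add(count)
    return len(seen)


# Finite prefixes standing in for the two periodic examples of the text.
purely_periodic = "01" * 1000
ultimately_periodic = "0110" + "1001" * 500
```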



7 The ab-complexity of the Tribonacci word 

As we have already seen, the notion of abelian complexity has many links to 
the notion of C-balance. The results of the previous section are partially due 
to the fact that the image under the Thue-Morse morphism of any recurrent 
infinite word is 2-balanced. We now investigate the ab-complexity of another 
well-known 2-balanced word, the so-called Tribonacci word 

$$\mathbf{t} = \tau^\infty(0) = 01020100102010\cdots$$

defined as the unique fixed point of the morphism $\tau$:

$$0 \mapsto 01, \quad 1 \mapsto 02, \quad 2 \mapsto 0.$$

Theorem 7.1. Let $\rho^{ab}_{\mathbf{t}}$ denote the ab-complexity of the
Tribonacci word $\mathbf{t}$. Then $\rho^{ab}_{\mathbf{t}}(n) \in
\{3, 4, 5, 6, 7\}$ for every positive integer $n$. Moreover, each of these
five values is assumed.

Proof. It is well-known that for all $n \geq 1$, $\mathbf{t}$ has exactly one
right special factor of length $n - 1$, and that, for this special factor that
we denote $t_{n-1}$, the three words $t_{n-1}0$, $t_{n-1}1$, and $t_{n-1}2$
are each factors of $\mathbf{t}$ of length $n$. Define non-negative integers
$i, j, k$ by $\Psi(t_{n-1}) = (i, j, k)$. Setting

$$\mathrm{Central}(n) = \{(i+1, j, k),\ (i, j+1, k),\ (i, j, k+1)\}$$

we have

$$\mathrm{Central}(n) \subseteq \Psi_{\mathbf{t}}(n). \qquad (1)$$

Given a vector $\vec{v} = (\alpha, \beta, \gamma)$, let
$\|\vec{v}\| = \max(|\alpha|, |\beta|, |\gamma|)$. Observe that the set of
vectors $\vec{v}$ such that $\|\vec{v} - \vec{u}\| \leq 2$ for all $\vec{u}$
in $\mathrm{Central}(n)$ is described by the graph of Figure 1 (where vectors
are vertices of the graph, and each edge $(\vec{v}, \vec{u})$ denotes the fact
that $\|\vec{v} - \vec{u}\| = 1$).

Since $\mathbf{t}$ is 2-balanced, $\Psi_{\mathbf{t}}(n)$ is a subset of this
set of twelve vectors. Moreover for the same reason, we should have
$\|\vec{v} - \vec{u}\| \leq 2$ for all $\vec{v}, \vec{u}$ in
$\Psi_{\mathbf{t}}(n)$. This implies that the only possibility for
$\Psi_{\mathbf{t}}(n)$ is to be a subset of one of the three sets delimited by
a regular hexagon in Figure 1 or one of the three sets delimited by an
equilateral triangle of base length 2. These sets have cardinalities 7 and 6
respectively, showing that $\rho^{ab}_{\mathbf{t}}(n) \leq 7$.

Figure 1: Links between Parikh vectors



By computer simulation we find that

$$(\rho^{ab}_{\mathbf{t}}(n))_{n \geq 1} = 334344434444443444444444444345544444554444\ldots$$

In particular, the least $n$ for which $\rho^{ab}_{\mathbf{t}}(n) = 5$ is
$n = 30$. We also found that the smallest $n$ for which
$\rho^{ab}_{\mathbf{t}}(n) = 6$ is $n = 342$, and the smallest $n$ for which
$\rho^{ab}_{\mathbf{t}}(n) = 7$ is $n = 3914$. The next four values of $n$ for
which $\rho^{ab}_{\mathbf{t}}(n) = 7$ are $n = 4063, 4841, 4990, 7199$. □
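
A simulation of this kind can be sketched in a few lines of Python. This is
not the authors' program: the function names and the prefix length are
assumptions, and the prefix only needs to be long enough to contain every
factor of the lengths examined. The code iterates the morphism to build a
prefix, then slides a window of length n while maintaining the three letter
counts.

```python
def tribonacci_prefix(length):
    """Prefix of the fixed point of the morphism 0 -> 01, 1 -> 02, 2 -> 0."""
    images = {"0": "01", "1": "02", "2": "0"}
    word = "0"
    while len(word) < length:
        word = "".join(images[c] for c in word)
    return word[:length]


def abelian_complexity(prefix, n, alphabet="012"):
    """Number of distinct Parikh vectors among the length-n factors of prefix."""
    index = {a: j for j, a in enumerate(alphabet)}
    counts = [prefix[:n].count(a) for a in alphabet]
    seen = {tuple(counts)}
    for i in range(n, len(prefix)):
        # Slide the window: the letter at i enters, the letter at i - n leaves.
        counts[index[prefix[i]]] += 1
        counts[index[prefix[i - n]]] -= 1
        seen.add(tuple(counts))
    return len(seen)
```

Evaluating `abelian_complexity` on a prefix of length 20000 for n from 1 to 42
gives values beginning 3, 3, 4, 3, 4, 4, 4, 3, with the first value 5 at
n = 30. Reaching n = 3914 in the same way would require a much longer prefix.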

It is surprising to us that the value $\rho^{ab}_{\mathbf{t}}(n) = 7$ does not
occur until $n = 3914$, but then re-occurs relatively shortly thereafter.
There are many interesting and mysterious properties observed in the behavior
of the Tribonacci word: For instance, for all $n \leq 184$, if $U$ and $V$ are
factors of $\mathbf{t}$ of length $n$, with $U$ a prefix of $\mathbf{t}$, then
$\big| |U|_a - |V|_a \big| \leq 1$ for all $a \in \{0, 1, 2\}$. But then this
fails for $n = 185$. The interested reader will find in [25] a proof that
$\rho^{ab}_{\mathbf{t}}(n) = 3$ if and only if $\mathbf{t}$ has a bispecial
factor of length $n - 1$. It is also proved in that paper that the abelian
complexity of $\mathbf{t}$ attains the value 7 infinitely often. It is an open
question to find a proof that values 4, 5 and 6 are also attained infinitely
often.

To end this section, we would like to stress the importance of the 2-balance
of the Tribonacci word to prove Theorem 7.1. Although this result is cited in
numerous articles, we were unable to find a proof of this fact in the
literature. We wrote a combinatorial proof in [25]. We also have a proof of
this fact that uses the spectral properties of the adjacency matrix associated
to the generating morphism [24].



8 Links with abelian powers



We now consider abelian powers. Repetitions occurring in infinite words are
a topic of great interest, with applications to a broad range of areas (see,
e.g.. Ill |3] |5] [HI). One stream of research dating back to the works of 
Thue [28, 29] is the study of patterns avoidable by infinite words (see, e.g.,
[17, 18, 26, 27, 2]). In the abelian context, F.M. Dekking [13] showed that
abelian 4-powers are avoidable on a 2-letter alphabet and that abelian cubes
are avoidable on a 3-letter alphabet. V. Keränen [16] proved that abelian
squares are avoidable on four letters. Let us recall that an abelian $k$-power
is any non-empty word of the form $W = U_1 U_2 \cdots U_k$ where
$U_i \sim_{ab} U_j$ for all $1 \leq i, j \leq k$.
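
The definition of an abelian k-power translates directly into code. The
Python sketch below is illustrative only, with hypothetical function names: it
tests whether a finite word is an abelian k-power, and searches for the
shortest block length at which an abelian k-power starts at a given position
of a finite word.

```python
from collections import Counter


def is_abelian_power(w, k):
    """True iff w is non-empty and splits into k abelian equivalent blocks."""
    if k < 1 or len(w) == 0 or len(w) % k != 0:
        return False
    m = len(w) // k
    first = Counter(w[:m])
    return all(Counter(w[j * m:(j + 1) * m]) == first for j in range(1, k))


def abelian_power_period_at(word, start, k, max_period):
    """Smallest block length m <= max_period giving an abelian k-power at start."""
    for m in range(1, max_period + 1):
        if start + k * m > len(word):
            break
        if is_abelian_power(word[start:start + k * m], k):
            return m
    return None
```

For example, the Thue-Morse prefix 011010011001 is an abelian 6-power with
blocks 01, 10, 10, 01, 10, 01 of length 2, while no abelian 6-power with
blocks of length 1 starts at its first position.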

Theorem 8.1. Any infinite word having bounded abelian complexity contains 
an abelian k-power for every positive integer k. 

This theorem could be considered as an abelian analogue of the celebrated
result by M. Morse and G.A. Hedlund: "infinite words with bounded subword
complexities are ultimately periodic". The proof of Theorem 8.1 uses van der
Waerden's theorem.



Theorem 8.1 raises natural questions: Is it true that any recurrent infinite
word having a bounded abelian complexity has the property that each position
begins in an abelian $k$-power? What about the case of uniformly recurrent
words? Note that here the requirement that the abelian complexity be bounded
is important. Indeed, in [13] F.M. Dekking showed that the fixed point of the
morphism $0 \mapsto 011$ and $1 \mapsto 0001$ is abelian 4-power free (this
word is recurrent since the morphism is primitive).

This problem seems difficult since we do not know the answer even in
the special case of the Thue-Morse word.
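
Dekking's example can be probed on a finite prefix with the Python sketch
below. It is a brute-force illustration with assumed function names and an
arbitrary prefix length, and a finite search cannot establish the theorem. The
code generates a prefix of the fixed point and looks for an abelian k-power
anywhere in it, using prefix sums of the number of 1s so that each block count
is a difference of two table entries.

```python
def dekking_prefix(length):
    """Prefix of the fixed point of the morphism 0 -> 011, 1 -> 0001."""
    images = {"0": "011", "1": "0001"}
    word = "0"
    while len(word) < length:
        word = "".join(images[c] for c in word)
    return word[:length]


def has_abelian_power(word, k):
    """Brute-force search for an abelian k-power occurring in a binary word."""
    # ones[i] is the number of 1s in word[:i].
    ones = [0]
    for c in word:
        ones.append(ones[-1] + (c == "1"))
    n = len(word)
    for m in range(1, n // k + 1):
        for start in range(0, n - k * m + 1):
            block = ones[start + m] - ones[start]
            if all(
                ones[start + (j + 1) * m] - ones[start + j * m] == block
                for j in range(1, k)
            ):
                return True
    return False
```

On a prefix of length 400 the search finds abelian cubes (for instance 000)
but no abelian 4-power, in agreement with the statement above.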



9 Abelian repetitions in Sturmian words 



When considering stronger hypotheses than in Theorem 8.1, one can naturally
expect to have a stronger result. This is what happens in Theorem 9.1 below
dealing with Sturmian words, that is, by Theorem 4.2, words having abelian
complexity 2 everywhere:

Theorem 9.1 ([23]). For every Sturmian word $\omega$ and every integer
$k \geq 1$, there exist two integers $\ell_1$ and $\ell_2$ such that each
position in $\omega$ has an occurrence of an abelian $k$-power with abelian
period $\ell_1$ or $\ell_2$.



Note that the situation in Theorem 9.1 is different than for usual powers.
Indeed every Sturmian word begins with infinitely many squares, but not
necessarily with a cube [9]. Sturmian words are optimal in the following
sense: In the next theorem, we say that $w$ has abelian period $p$ if
$w = x_1 x_2 \cdots x_n$ for some pairwise abelian equivalent words $x_i$ with
$p = |x_1| = \cdots = |x_n|$.

Theorem 9.2. If an infinite word $x$ is abelian $k$-repetitive such that every
position starts with an abelian $k$-power with a fixed abelian period $m$,
then $x$ is ultimately periodic.

Remark 9.3. The property mentioned in Theorem 9.1 is not characteristic
for Sturmian words. Indeed if $t$ is a Sturmian word and $f$ is the morphism
defined by $f(a) = aa$, $f(b) = ab$, the word $f(t)$ is not Sturmian and
verifies this property. More precisely if every position of $t$ starts with an
abelian $k$-power of abelian period either $\ell_1$ or $\ell_2$, then every
position of $f(t)$ starts with an abelian $k$-power of abelian period either
$2\ell_1$ or $2\ell_2$. Considering instead of $f$ the morphism $g$ defined by
$g(a) = c_1 c_2 \cdots c_n a$ and $g(b) = c_1 c_2 \cdots c_n b$ with
$c_1, \ldots, c_n$ letters, one can find non ultimately periodic words over
arbitrary alphabets having the previous property.

We end this section with two further results on abelian repetitions: 

Theorem 9.4. For all integers $k \geq 1$, each suffix of the Tribonacci word
begins in an infinite number of abelian $k$-powers.

We do not know whether this holds for all words in the subshift generated 
by the Tribonacci word. For the Thue-Morse word, we have 

Theorem 9.5. Each suffix of the Thue-Morse word begins in an abelian 6-power.

We do not know whether this holds for abelian 7-powers.